CORRECTED APPEAL BRIEF

Sir: 

This Appeal Brief is submitted in support of the Notice of Appeal filed May 30, 2003.

REAL PARTY IN INTEREST 

International Business Machines Corporation is the real party in interest as 
assignee of the subject application. 

RELATED APPEALS AND INTERFERENCES

The Appellant, the Appellant's legal representative, and the Assignee are not aware of 
any other appeals or interferences which will directly affect, be directly affected by, or 
have a bearing on the Board's decision in this Appeal. 

STATUS OF CLAIMS 

Claims 1 and 4 through 15 (the claims at issue) are pending in the above-identified 
patent application. The claims at issue were finally rejected in an Office Action dated 
January 27, 2003. The final rejection of the claims at issue is hereby appealed. 

The claims at issue all stand rejected under 35 U.S.C. § 101 as being directed to
non-statutory subject matter. Claim 1 has been rejected under 35 U.S.C. § 102(e)(2) as
being anticipated by Piatt et al. (U.S. Patent 5,784,294, hereafter "Piatt"). The final
Office Action rejected claims 1 and 4 under 35 U.S.C. § 112, first paragraph, as
containing subject matter that was not described in the original specification in such a
way as to convey to one skilled in the art that the inventor possessed the claimed invention.

STATUS OF AMENDMENTS 

The above-identified patent application was filed on March 24, 1999. An Office Action 
(Paper No. 3) was issued on September 13, 2001, rejecting claims 1 through 30. On 
November 2, 2001 an Amendment was filed in response to the Office Action, wherein 
claims 16 through 30 were cancelled. A Non-Final Office Action was issued on August 
8, 2002 rejecting claims 1 through 15. In response to that Office Action applicants filed 
an amendment on November 13, 2002 amending claims 1, 4, and 5 and canceling 
claims 2 and 3. On January 27, 2003 the Examiner issued a final Office Action rejecting 
all of the claims at issue. On May 30, 2003 a Notice of Appeal to the Board of Appeals
was filed. 

SUMMARY OF INVENTION 

The present invention, as set forth in claim 1, and as described and shown in the
specification and the Figures of the above-identified patent application, is directed to a 
method for generating and storing data characterizing at least one region of said 
plurality of regions, the method comprises the steps of: 

generating an entry [page 3, line 18] comprising i) an identifier that identifies said
at least one region, and ii) data characterizing a set of axes derived from a property 
distribution of said at least one region [page 3, lines 19-20]; 

applying a mapping to the descriptor vector associated with said at least one 
region [page 3, lines 20-21] based on preselected criteria [page 3, lines 10-13]; 

generating a key that corresponds to said mapping of the descriptor vector 
associated with said at least one region [page 3, lines 20-21]; and 

storing said entry in a memory [page 3, line 21], wherein said key is associated 
with said entry such that the key indexes the entry for retrieval thereof [page 4, lines 2- 
3]. A concept underlying the claimed invention is the storage of data in groupings that 
are sensitive to the way a human would search for stored information, thus facilitating 
retrieval of the stored data in a way that is useful for using the molecules. 
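
As a rough illustration only, the four steps recited in claim 1 can be sketched in Python. Every name used here (`Entry`, `apply_mapping`, `make_key`, `store_entry`), the projection used as the mapping, and the binning used to build the key are hypothetical choices made for this sketch; the specification does not prescribe them.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Key = Tuple[int, ...]


@dataclass
class Entry:
    """Step 1: an identifier plus data characterizing a set of axes."""
    region_id: str
    axes: List[List[float]]


def apply_mapping(descriptor: Sequence[float],
                  components: Sequence[Sequence[float]]) -> List[float]:
    """Step 2: map the descriptor vector (assumed here to be a projection
    onto component vectors that were selected in advance)."""
    return [sum(d * c for d, c in zip(descriptor, comp)) for comp in components]


def make_key(mapped: Sequence[float], cell: float = 0.5) -> Key:
    """Step 3: derive a key from the mapped vector (assumed here: binning)."""
    return tuple(int(v // cell) for v in mapped)


def store_entry(memory: Dict[Key, List[Entry]], key: Key, entry: Entry) -> None:
    """Step 4: store the entry so that the key indexes it for retrieval."""
    memory.setdefault(key, []).append(entry)


memory: Dict[Key, List[Entry]] = {}
entry = Entry("region-A", axes=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
mapped = apply_mapping([0.9, 0.2, 1.4], components=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
key = make_key(mapped)
store_entry(memory, key, entry)
retrieved = memory[key]  # later retrieval uses the same key
```

In this sketch, regions whose mapped descriptor vectors fall in the same bin share a key, so a search that computes the key for a query vector reaches the stored entries directly.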

ISSUES 

I. Whether the Examiner erred in rejecting claims 1 and 4-15 as being directed
to non-statutory subject matter.

II. Whether the Examiner erred in rejecting claim 1 under 35 U.S.C. § 102(e)(2)
as being anticipated by U.S. Patent Number 5,784,294 (Piatt).

III. Whether the Examiner erred in rejecting claims 1 and 4 under 35 U.S.C.
§ 112, first paragraph, as containing subject matter which was not described in
the specification in such a way as to reasonably convey to one skilled in the
art that the inventor possessed the invention.

Docket No. YOR9 199801 12

Serial No. 09/275,568

GROUPING OF CLAIMS 

For purposes of this Appeal, the claims at issue stand or fall together. 

ARGUMENT 

I. CLAIM REJECTIONS UNDER 35 U.S.C. § 101 

The Examiner erred in rejecting the claims at issue under 35 U.S.C. §101 on 
grounds that the claimed invention is allegedly directed to non-statutory subject matter. 

The analysis of whether a claim is directed to statutory subject matter begins with the
language of 35 U.S.C. § 101, which reads:

"Whoever invents or discovers any new and useful process, machine, manufacture, or 
composition of matter, or any new and useful improvement thereof, may obtain a patent 
therefor, subject to the conditions and requirements of this title." 

In AT&T Corp. v. Excel Communications, Inc., 172 F.3d 1352, 50 USPQ2d 1447 (Fed.
Cir. 1999), the United States Court of Appeals for the Federal Circuit said that the
Supreme Court has construed § 101 broadly, noting that Congress intended statutory
subject matter to "include anything under the sun that is made by man." See Diamond
v. Chakrabarty, 447 U.S. 303, 309 (1980) (quoting S. Rep. No. 82-1979, at 5 (1952);
H.R. Rep. No. 82-1923, at 6 (1952)); see also Diamond v. Diehr, 450 U.S. 175, 182
(1981). Notwithstanding the broad scope of statutory subject matter, the Court has
specifically identified three categories of unpatentable subject matter: "laws of nature,
natural phenomena, and abstract ideas." See Diehr, 450 U.S. at 185.

In this appeal, all of the claims at issue are method claims which fall within the "process"
category of the four enumerated categories of patentable subject matter in § 101. The
examiner determined that the claims at issue recite a method that "is merely arranging 
the data based on an algorithm without any practical application." 

The subject patent application claims methods or processes for generating and storing 
data. Moreover these processes are performed by a machine (a data processing 
system). The data are expressed and processed as electrical signals operated upon by 
a processing apparatus. That is a practical application of the invention. The claim also 
states in the storing step that the key is associated with the subject data entry for 
retrieval thereof. 

The Examiner's determination constitutes an error of law. The issue of failure to claim 
statutory subject matter is one of law that is reviewed de novo. See AT&T Corp. v. 
Excel Communications, Inc. , supra. 

The storage of data in a computer memory is by itself a concrete and useful result.
Claim 1, which is representative, includes four steps that precisely set forth how the
subject information is stored in memory. In In re Lowry, 32 F.3d 1579, 32 USPQ2d
1031 (Fed. Cir. 1994), the Federal Circuit held that claims directed to an invention
related to "storage, use, and management of information residing in a memory were
entitled to patentable weight." In Lowry, the Board of Patent Appeals and Interferences
reversed the statutory subject matter rejections made by the Examiner under 35 U.S.C.
§ 101. The Federal Circuit then went further by holding that claim limitations relating to
the storage of information in a computer memory are entitled to patentable weight in
distinguishing the prior art. The method claimed in the instant application is similar to
the storage of data in Lowry. Storage of information in a computer memory is an
important aspect of information technologies because information processing apparatus
must read data from computer memory to execute operations on that information.

Considering the claims at issue as a whole, as they must be, it becomes clear that the
information stored in a memory has a practical purpose beyond the mere storage of the 
information - retrieval of the stored information based on the key mapped to a 
descriptor vector. The ability to retrieve information from memory based on various 
criteria is perhaps as important as storage of the information. 

The technology of search engines which is the subject of numerous patents is 
concerned with this very concept. Failure to provide patent protection to inventions in 
the art of data retrieval would violate the constitutional mandate of promoting the 
progress of science and the useful arts. Therefore, the rejection of the claims at issue 
for failure to recite statutory subject matter must be reversed. 

In the final office action dated January 27, 2003, the Examiner rejected appellants' 
arguments and reasserted the position that the claims are directed to unpatentable 
subject matter. The Examiner thus contended that: "An invention where a system 
merely stores data such as descriptor vectors associated with a plurality of regions of 
molecules onto a media [sic, medium] is considered to be non-statutory subject matter 
because the said data is considered to be nonfunctional descriptive material." This 
rejection is nothing more than an application of the "printed matter" category of
non-patentable subject matter that was so clearly discredited and reversed in the In re Lowry
decision. As noted above, the general rule of patentable subject matter is expansive
and any determination of failure to claim statutory subject matter must find its support in
the case law. In the final office action the Examiner concedes that the claimed invention
does not lack utility under section 101 of the patent statute. Therefore, the rejection is
either based on the printed matter exception or on the algorithm exception to the rule of
patentability. As noted above, storage of data (which is by its nature descriptive) is a
very important aspect of the information technology arts. That fact was recognized in In
re Lowry when the Federal Circuit laid to rest the doctrine of printed matter as applied to
data stored in computer readable media. The Federal Circuit's decision in In re Lowry
requires reversal of the Examiner's determination of failure to claim statutory subject
matter.

To the extent that the Examiner's section 101 rejection relied on the mathematical 
algorithm doctrine, it must also be reversed. In AT&T, supra, the Federal Circuit said
that any step-by-step process involves an algorithm in the broad sense of the word.
The AT&T court thus said: "Since the process of manipulation of numbers is a
fundamental part of computer technology, we have had to reexamine the rules that
govern the patentability of such technology. The sea-changes in both law and
technology stand as a testament to the ability of law to adapt to new and innovative
concepts, while remaining true to basic principles." AT&T, 172 F.3d at 1356. Thus the
AT&T court limited the "Algorithm" doctrine to apply only in cases of purely "abstract"
algorithms. See AT&T at 1357. In AT&T, the Federal Circuit also said that the
algorithm must be applied in a useful way and found a practical result in the claimed
methods in the addition of certain descriptive information called a PIC (or primary
interexchange carrier) to certain other information used in switching telephone calls. The
information in the claims at issue in the instant case also has a useful result - the 
storage for retrieval of information from a computer memory responsive to a search for 
certain criteria. The retrieved information is useful for, among other purposes, 
determining properties of molecules. 

The Examiner's argument that the information stored according to the claims at issue is
merely descriptive would, if applied to the field of photography, preclude the patentability
of cameras, because cameras take light, one form of information that represents an
object, and record the information on film. The information is merely descriptive of the
subject of the photograph. The application of the Examiner's reasoning to the clearly
patentable area of photography illustrates the point that the claimed invention, which is
analogous to other forms of data storage, should not be precluded from patentability.

The Federal Circuit rejected an argument similar to the Examiner's in Arrhythmia
Research Technology, Inc. v. Corazonix Corp., 22 USPQ2d 1033 (Fed. Cir. 1992),
where processing information describing a patient's heartbeat was held to be statutory 
subject matter. The court there said that the claims at issue did not preempt all uses of 
the algorithm, Arrhythmia at 1060. Similarly, in the instant case the claims do not 
preempt all uses of any algorithms; rather they are limited to storage and retrieval in a 
computer memory. Therefore, appellants request reversal of the rejection of the claims 
at issue under 35 U.S.C. §101. 

II. CLAIM REJECTIONS UNDER 35 U.S.C. § 102(e)(2)

Claim 1 was rejected under 35 U.S.C. § 102(e)(2) as being anticipated by Piatt (U.S. 
Pat. No. 5,784,294). This rejection should be reversed because the Examiner has not 
shown that claim 1 is anticipated by Piatt. Nowhere does Piatt teach or disclose any of
the elements of claim 1. Piatt relates to a storage device that performs a plurality of
functions that produce a result that can be an input to the method of claim 1 of the
instant application but which does not anticipate the claims at issue. Piatt does not
disclose the required mapping, generation of a key, or storing of the entry as required by
claim 1.

In the final office action, the Examiner contends that Piatt discloses at Fig. 9 "storing an 
entry comprising a molecular descriptor with a key to access it." Fig. 9 of Piatt is a flow 
chart illustrating use of descriptors. It does not relate to a mapping of descriptor vectors 
(as claimed) at all. The key generated according to claim 1 corresponds to the 
mapping. The Examiner has not shown how Fig. 9 of Piatt performs a mapping or any 
of the claimed steps. 

The Examiner further says "Piatt et al. teaches storing said first and second descriptors 
of each molecule in said series of molecules in a database for subsequent processing to 
thereby identify correspondence between molecules in said series of molecules (Claim 
34, Lines 39-42)." That statement does not describe any of the claimed steps. The
claimed step of "storing" relates to the entry defined in the first step. The section of
Piatt cited has nothing to do with such an entry and hence cannot correspond to the
claimed storing step, or any other claimed step.

The Examiner also argues that the "key" is inherent. Again, the claimed "key"
corresponds to the claimed mapping and the Examiner has not shown anything in Piatt 
corresponding to such a mapping. Instead, the Examiner argues that Piatt has some 
criteria for selecting molecules of the training set to be placed in a table and that this 
corresponds to "applying the mapping." The Examiner does not show how the 
placement of molecules in a table relates even remotely to applying a mapping to a 
descriptor vector as claimed and has fallen far short of the exact relationship that 
anticipation requires. 

Finally, the Examiner has erred as a matter of law by arguing that a type of data 
structure allegedly disclosed in Piatt "is consistent with" the limitation of "key indexes the
entry for retrieval thereof." The legal test for anticipation is whether every element of a
claim is found in an item of prior art, and not whether a structure is consistent with a 
claimed method. Therefore, appellants request reversal of the rejections under Section 
102(e). 

III. CLAIM REJECTIONS UNDER 35 U.S.C. § 112 

The final Office Action rejected claims 1 and 4, under 35 U.S.C. § 112, first paragraph, 
as containing subject matter that was not described in the original specification as to 
convey to one skilled in the art that the inventor possessed the claimed invention. The 
examiner (or the Board, if the Board is the first body to raise a particular ground for
rejection) "bears the initial burden ... of presenting a prima facie case of
unpatentability." In re Oetiker, 977 F.2d 1443, 1445, 24 USPQ2d 1443, 1444 (Fed. Cir.
1992). Insofar as the written description requirement is concerned, that burden is
discharged by "presenting evidence or reasons why persons skilled in the art would not
recognize in the disclosure a description of the invention defined by the claims." In re
Wertheim, 541 F.2d 257, 263, 191 USPQ 90, 97 (CCPA 1976). Thus, the burden placed
on the examiner varies, depending upon what the applicant claims. If the specification
contains a description of the claimed invention, albeit not in ipsis verbis (in the identical
words), then the examiner or Board, in order to meet the burden of proof, must provide
reasons why one of ordinary skill in the art would not consider the description sufficient.
Id. at 264, 191 USPQ at 98. In the present case, the amendment of November 13,
2002, amended claim 1 so that the step of applying a mapping to the descriptor vector
is based on pre-selected criteria. Support for the amendment does not have to be ipsis
verbis. It is inherent from the discussion on page 40 of the specification that the
application of the mapping is based on pre-determined criteria. Note that, according to
the discussion on page 40, the "association criteria" are defined in the prior training
phase, and thus the association criteria were clearly "pre-determined."

Claim 1 was further amended to state that "the key indexes the entry for retrieval 
thereof." It is inherent in the claimed invention that the key indexes the entry for 
retrieval of the entry. Why else would a key corresponding to a mapping be used? In 
any case, the gist of the written description requirement is to prevent an applicant from 
adding claims to subject matter that the inventor did not possess at the time of filing. 
Vas-Cath Inc. v. Mahurkar, 935 F.2d 1555, 19 USPQ2d 1111 (Fed. Cir. 1991). As
appellants noted in the amendment of November 13, 2002, the amendment was not 
made to define additional subject matter but to make clear what was already implicit. 
Applicant again poses the question - why would information be stored if not for retrieval 
thereof? 

The Examiner contends, at page 6, that the introduction in claim 4, lines 2-3, of
"computed from a convolution with a probe" is new matter because it has
not been found in the application as filed. Appellant disagrees. The computation of a
convolution is found at page 29, line 15 - page 30, line 5. Line 5 is the definition of a
convolution. In any case, a smearing function is used in a convolution. See Protein
Crystallography Course, page 10 (attached as Exhibit A). Using a probe function is
also a well-known concept. This methodology applied to
electromagnetic fields was presented in J. D. Jackson's "Classical Electrodynamics, 2nd 
Ed." Wiley, NY 1975. The idea is that you have a piece of equipment (a probe) that 
you can put in the material and measure the field with. The probe has some geometric 
extent, and adds together parts of the field in the vicinity of the point identifying the 
location of the field. That type of addition is called a convolution. The instrument is 
called a probe. The response of the probe is a probe function. 
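
The probe description above can be illustrated with a small discrete sketch in Python. The field values, the three-point probe weights, and the function name `probe_response` are assumptions chosen for illustration; they are not taken from the specification or from Jackson.

```python
from typing import List, Sequence


def probe_response(field: Sequence[float], probe: Sequence[float]) -> List[float]:
    """Convolve a sampled field with a probe function of odd length.

    The reading at each point is the weighted sum of the field values in the
    vicinity of that point; samples outside the field are treated as zero.
    """
    half = len(probe) // 2
    readings = []
    for u in range(len(field)):
        total = 0.0
        for j, weight in enumerate(probe):
            x = u - (j - half)  # convolution index: field sampled at u - offset
            if 0 <= x < len(field):
                total += weight * field[x]
        readings.append(total)
    return readings


field = [0.0, 0.0, 4.0, 0.0, 0.0]   # a sharp feature in the field
probe = [0.25, 0.5, 0.25]           # probe with a finite geometric extent
readings = probe_response(field, probe)
```

Because the probe has finite extent, the sharp feature in the field is spread over neighbouring readings, which is the smearing behaviour referred to above.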

The Examiner raised, at page 6, that claim 6, lines 5-6, recites "a ratio of variations ...
discriminant criterion function" and lines 9-10 recite "a criterion function ... utilizes the
first data and the second data", which the examiner did not find in the specification. The
examiner noted that pages 10-11 describe "F-distributed ratios ... representing
discrimination between groups of items", which the examiner contends to be different.
First, the language "identifying a set of component vectors that maximizes a ratio of
variations between groups to the variations within the groups along the component 
vectors as a discriminant criterion function" is found at page 23, line 15 of the original 
specification. The computation of a component vector is shown at that line. 
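
For readers unfamiliar with the quoted language, a ratio of variations between groups to variations within groups along a vector can be sketched as follows. This is a generic illustration under stated assumptions, not the computation shown in the specification: the function names and sample data are hypothetical, and the constant based on degrees of freedom recited in claim 8 is omitted.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def project(v: Vector, w: Vector) -> float:
    """Dot product of v with the direction w."""
    return sum(a * b for a, b in zip(v, w))


def variation_ratio(groups: Sequence[Sequence[Vector]], w: Vector) -> float:
    """Between-group variation divided by within-group variation along w."""
    all_vectors = [v for g in groups for v in g]
    grand = project(mean(all_vectors), w)
    between = 0.0
    within = 0.0
    for g in groups:
        centre = project(mean(g), w)
        between += len(g) * (centre - grand) ** 2
        within += sum((project(v, w) - centre) ** 2 for v in g)
    return between / within


groups = [
    [(0.0, 0.0), (1.0, 2.0)],   # hypothetical group 1 of descriptor vectors
    [(4.0, 0.0), (5.0, 2.0)],   # hypothetical group 2 of descriptor vectors
]
ratio_x = variation_ratio(groups, (1.0, 0.0))   # direction that separates the groups
ratio_y = variation_ratio(groups, (0.0, 1.0))   # direction that does not
```

A direction that maximizes this ratio is one along which the groups are far apart relative to their internal spread, which is the sense in which the criterion discriminates between groups.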

Support for the language "generating a criterion function for subsets of the component 
vectors, wherein the criterion function utilizes the first data and the second data" is 
found at page 9, line 8 - page 9, line 4; page 22, lines 21-22; and page 23, lines 16-22;
and step 217 of FIG. 2.

The Examiner raised, at page 6, that the limitation of claim 8, lines 4-5, "T indicates a
transpose ... representing co-variance", has not been found in the specification. It does
not matter that the specification has not defined these terms because those terms are
well known in mathematics to have the meaning shown in the claim. A copy of a page
from J. Truss, Discrete Mathematics for Computer Scientists, 2d Ed, Addison-Wesley
(1999) is attached (Exhibit B) showing the usage of the well known symbol for a
transpose. Moreover, that Σ_b is a first data representing covariance is found at page 6
of the original specification. The computation of the covariances ("T indicates a
transpose representing covariance") is spelled out on pg. 8 in the original
submission, lines 3 and 6. This section describes in detail the specific covariances in
question, what T means, etc. It even calls them covariances.

The Examiner has not shown any reason why the language added in the amendment 
would not be supported by the specification and in fact appellants contend that the 
amendment was not made for purposes of patentability, so the invention defined by the
claims is the same both before and after the amendment and hence was clearly in the
possession of the inventor at the time of the filing of the application. Therefore,
appellants request reversal of this rejection. 



CONCLUSION

In view of the foregoing, it is respectfully submitted that the application and the
claims are in condition for allowance. Reversal of the final rejection, and allowance of
the claims as amended, are requested.



Respectfully submitted, 





Michael J. Buchenhorner
Registration No. 33,162 
Attorney for Applicants 

APPENDIX



1. (Amended) In a data processing system wherein descriptor vectors associated
with a plurality of regions of molecules are stored in a database, a method for 
generating and storing data characterizing at least one region of said plurality of 
regions, the method comprising the steps of: 

generating an entry comprising i) an identifier that identifies said at least one 
region, and ii) data characterizing a set of axes derived from a property distribution of 
said at least one region; 

applying a mapping to the descriptor vector associated with said at least one 
region based on preselected criteria; 

generating a key that corresponds to said mapping of the descriptor vector 
associated with said at least one region; and 

storing said entry in a memory, wherein said key is associated with said entry 
such that the key indexes the entry for retrieval thereof. 

2. (Cancelled) The method of claim 1, wherein said set of axes are invariant 
to rotation and translation of said at least one region. 

3. (Cancelled) The method of claim 2, wherein said set of axes are derived 
from principal axes of said property distribution. 

4. (Amended) The method of claim 1, wherein said property distribution of 
said at least one region is computed from a convolution with a probe function to a 
property field. 

5. (Amended) The method of claim 1, wherein said plurality of descriptor
vectors are classified into groups, and wherein said mapping step maps said descriptor 
vectors to a space discriminating between said groups of descriptor vectors. 

6. The method of claim 5, wherein said mapping is derived from the steps of:

generating first data representing differences between said groups of descriptor
vectors;

generating second data representing variations within said groups of descriptor 
vectors; 

identifying a set of component vectors that maximizes a ratio of variations 
between groups to the variations within the groups along the component vectors as a 
discriminant criterion function; 

generating a criterion function for subsets of the component vectors, wherein the 
criterion function utilizes the first data and the second data; 

for each particular subset of component vectors, calculating a probability value 
for the criterion functions associated with the particular subset; 

selecting a probability value from probability values for said subsets of 
component vectors based upon a predetermined criterion; 

identifying the subset of component vectors associated with the selected 
probability value; and 

generating a mapping to a space corresponding to the subset of component 
vectors associated with the selected probability value, and storing the mapping for 
subsequent processing. 

7. The method of claim 6, wherein said first data comprises a matrix Σ_b
representing covariance between said groups of descriptor vectors, and said second
data comprises a matrix Σ_w representing covariance within said groups of descriptor
vectors. 

8. The method of claim 7, wherein said criterion function has the general form:

$$f(w) = C \, \frac{w^T \Sigma_b w}{w^T \Sigma_w w}$$

where w is some vector, T indicates a transpose, Σ_b is a first data representing
covariance, Σ_w is a second data representing covariance and C is a constant based
upon degrees of freedom in Σ_b and Σ_w.

9. The method of claim 8, wherein C is determined as follows:

$$C = \frac{1/(\text{degrees of freedom in } \Sigma_b)}{1/(\text{degrees of freedom in } \Sigma_w)} = \frac{1/(N-1)}{1/(\sum m - N)}$$

where N represents the number of groups of descriptor vectors, m represents the
number of regions, and Σ represents the sum of m for the N groups.

10. The method of claim 7, wherein the step of identifying a set of component 
vectors that maximizes an F distributed criterion function comprises the substeps of: 

determining a set of (eigenvalue, eigenvector) pairs for the matrix Σ_w;

determining said set of component vectors based upon said set of (eigenvalue,
eigenvector) pairs for the matrix Σ_w.

11. The method of claim 10, wherein said statistic for a given subset of
component vectors is based upon value of said criterion function for said subset of 
component vectors. 

12. The method of claim 11, wherein said statistic for a given subset of 
component vectors has the following form: 

where f_k represents the value of the criterion function at a component vector in the
given subset,
C is a constant,
L_s represents the number of f_k values in the given subset of component vectors, and
the Σ operation sums over the L_s f_k values in the given subset of component vectors.

13. The method of claim 12, wherein said a probability value for a particular F- 
distributed statistic represents a probability value that the particular F-distributed 
statistic could have been larger by chance. 

14. The method of claim 13, wherein said probability value selected from 
probability values for said subsets of component vectors is a minimum probability value 
of said probability values for said subsets of component vectors. 

15. The method of claim 6, wherein said mapping for said at least one
descriptor vector performs a loop over each component vector belonging to the subset
of component vectors associated with the selected probability; wherein, in each iteration
of said loop, dot product of said descriptor vector with a transpose of a unit vector for
the given component vector is added to a running sum.



The convolution theorem and its applications

Outline

• What is a convolution?
    o Convolutions are commutative
• The convolution theorem
    o Proof of the convolution theorem
• The correlation theorem
    o What is a correlation function?
    o Parseval's theorem
• Important convolutions
    o Convolution with a Gaussian
    o Convolution with a delta function
• Applications of the convolution theorem
    o Atomic scattering factors
    o B-factors
    o Diffraction from a lattice
    o Diffraction from a crystal
    o Resolution truncation
    o Density modification
• Applications of the correlation theorem
    o The Patterson function
    o The phased translation function

What is a convolution?

One of the most important concepts in Fourier theory, and in crystallography, is that of a convolution.
Convolutions arise in many guises, as will be shown below. Because of a mathematical property of the
Fourier transform, referred to as the convolution theorem, it is convenient to carry out calculations
involving convolutions.

But first we should define what a convolution is. Understanding the concept of a convolution operation
is more important than understanding a proof of the convolution theorem, but it may be more difficult!

Mathematically, a convolution is defined as the integral over all space of one function at x times another
function at u-x. The integration is taken over the variable x (which may be a 1D or 3D variable),
typically from minus infinity to infinity over all the dimensions. So the convolution is a function of a
new variable u, as shown in the following equations. The cross in a circle is used to indicate the
convolution operation.



Note that it doesn't matter which function you take first, i.e. the convolution operation is commutative. 
We'll prove that below, but you should think about this in terms of the illustration below. This
illustration shows how you can think about the convolution, as giving a weighted sum of shifted copies 
of one function: the weights are given by the function value of the second function at the shift vector. 
The top pair of graphs shows the original functions. The next three pairs of graphs show (on the left) the 
function g shifted by various values of x and, on the right, that shifted function g multiplied by f at the 
value of x. 



$$C(u) = f(x) \otimes g(x) = \int_{\text{space}} f(x)\, g(u-x)\, dx$$
[Figure: graphs of the functions g(x) and f(x), the shifted copies of g weighted by f, f(x)g(u-x), and the resulting convolution.]

The bottom pair of graphs shows, on the left, the superposition of several weighted and shifted copies of 
g and, on the right, the integral (i.e. the sum of all the weighted, shifted copies of g). You can see that 
the biggest contribution comes from the copy shifted by 3, i.e. the position of the peak of f. 

If one of the functions is unimodal (has one peak), as in this illustration, the other function will be 
shifted by a vector equivalent to the position of the peak, and smeared out by an amount that depends on 
how sharp the peak is. But alternatively we could switch the roles of the two functions, and we would 
see that the bimodal function g has doubled the peaks of the unimodal function f. 
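
The "weighted sum of shifted copies" picture described above can be sketched for sampled functions as follows. The sample values and the function name `convolve_by_shifted_copies` are assumptions made for illustration; they are not read from the original figure.

```python
from typing import List, Sequence


def convolve_by_shifted_copies(f: Sequence[float], g: Sequence[float]) -> List[float]:
    """Build the discrete convolution as a sum of shifted copies of g,
    each copy weighted by the value of f at the shift."""
    result = [0.0] * (len(f) + len(g) - 1)
    for shift, weight in enumerate(f):
        for i, value in enumerate(g):
            result[shift + i] += weight * value
    return result


f = [0.0, 0.0, 0.0, 1.0, 0.0]   # unimodal: a single sharp peak at position 3
g = [2.0, 0.0, 1.0]             # bimodal: peaks at positions 0 and 2
c = convolve_by_shifted_copies(f, g)
```

With a perfectly sharp peak in f, the result is simply g moved to the position of that peak; a broader peak in f would add further weighted copies and smear g out.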

Convolutions are commutative 

$$f(x) \otimes g(x) = g(x) \otimes f(x)$$



We stated that the convolution integral is commutative. Here, in case you are interested, is a quick proof 
of that. First, we start with the convolution integral written one way. For convenience we will deal with 
the 1D case, but the 3D case is exactly analogous.



$$C(u) = \int_{-\infty}^{\infty} f(x)\, g(u-x)\, dx$$



Now we substitute variables, replacing u-x with a new x':

$$x' = u - x, \qquad dx' = -dx$$

$$C(u) = -\int_{\infty}^{-\infty} f(u-x')\, g(x')\, dx'$$

Note that, because the sign of the variable of integration changed, we have to change the signs of the 
limits of integration. Because these limits are infinite, the shift of the origin (by the vector u) doesn't 
change the magnitude of the limits. 

Now we reverse the order of the limits, which changes the sign of the equation, and swap the order of 
the functions g and f. 

\[ C(u) = \int_{-\infty}^{\infty} g(x')\, f(u - x')\, dx' \]

It doesn't matter whether we call the variable of integration x' or x, so we put back x, to get the result we 
wanted to prove. 

\[ C(u) = \int_{-\infty}^{\infty} g(x)\, f(u - x)\, dx \]
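
As a quick numerical companion to the proof, the same statement can be checked for sampled functions, where the integral becomes a finite sum. This is a sketch with made-up sample values, not a substitute for the derivation.

```python
def convolve(f, g):
    """Direct discrete convolution: c[u] = sum over x of f[x] * g[u - x]."""
    c = [0.0] * (len(f) + len(g) - 1)
    for u in range(len(c)):
        for x in range(len(f)):
            if 0 <= u - x < len(g):       # g is treated as zero outside its samples
                c[u] += f[x] * g[u - x]
    return c

# Hypothetical sample values.
f = [0.0, 0.25, 1.0, 0.25]
g = [1.0, 0.0, 0.5]

fg = convolve(f, g)   # f convolved with g
gf = convolve(g, f)   # g convolved with f
commutes = all(abs(a - b) < 1e-12 for a, b in zip(fg, gf))
print(fg, gf, commutes)
```

Swapping the arguments corresponds to the substitution x' = u - x: the same products are summed, just in the opposite order.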

The convolution theorem 

Because there will be so many Fourier transforms in the rest of this presentation, it is useful to introduce 
a shorthand notation. T will be used to indicate a forward Fourier transform, and its inverse, T^{-1}, to 
indicate the inverse Fourier transform. 

\[ T(f(\mathbf{r})) = \int_{\text{space}} f(\mathbf{r}) \exp(2\pi i\, \mathbf{s} \cdot \mathbf{r})\, d\mathbf{r} \]

There are two ways of expressing the convolution theorem: 

• The Fourier transform of a convolution is the product of the Fourier transforms. 

• The Fourier transform of a product is the convolution of the Fourier transforms. 

\[ T(f \otimes g) = T(f)\, T(g) \]

\[ T(fg) = T(f) \otimes T(g) \]

The convolution theorem is useful, in part, because it gives us a way to simplify many calculations. 
Convolutions can be very difficult to calculate directly, but are often much easier to calculate using 
Fourier transforms and multiplication. 
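
The first statement can be checked numerically with a discrete Fourier transform. The sketch below assumes periodic (circular) convolution, which is the form the discrete theorem takes, and uses a deliberately naive transform with the same positive sign in the exponent as the text; the sample values are made up.

```python
import cmath

def dft(values):
    """Naive discrete Fourier transform with a +2*pi*i exponent, as in the text."""
    n = len(values)
    return [sum(values[x] * cmath.exp(2j * cmath.pi * s * x / n) for x in range(n))
            for s in range(n)]

def circular_convolve(f, g):
    """Periodic convolution: c[u] = sum over x of f[x] * g[(u - x) mod n]."""
    n = len(f)
    return [sum(f[x] * g[(u - x) % n] for x in range(n)) for u in range(n)]

# Hypothetical sample values on a grid of 8 points.
f = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
g = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]

lhs = dft(circular_convolve(f, g))               # transform of the convolution
rhs = [a * b for a, b in zip(dft(f), dft(g))]    # product of the transforms
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_error)
```

The direct convolution costs on the order of n^2 multiplications, which is why the transform route becomes attractive once a fast Fourier transform replaces the naive `dft` used here.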

Proof of the convolution theorem 

To prove the convolution theorem, in one of its statements, we start by taking the Fourier transform of a 
convolution. What we want to show is that this is equivalent to the product of the two individual Fourier 
transforms. Note, in the equation below, that the convolution integral is taken over the variable x to give 
a function of u. The Fourier transform then involves an integral over the variable u. 

\[
\begin{aligned}
T(f(x) \otimes g(x)) &= T\left( \int_{-\infty}^{\infty} f(x)\, g(u - x)\, dx \right) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\, g(u - x)\, dx\, \exp(2\pi i s u)\, du
\end{aligned}
\]

Now we substitute a new variable w for u - x. As above, the infinite integration limits don't change. Then 
we expand the exponential of a sum into the product of exponentials and rearrange to bring together 
expressions in x and expressions in w. 

\[
\begin{aligned}
T(f(x) \otimes g(x)) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\, g(w)\, \exp[2\pi i s (x + w)]\, dx\, dw \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x) \exp(2\pi i s x)\, g(w) \exp(2\pi i s w)\, dx\, dw
\end{aligned}
\]

Expressions in x can be taken out of the integral over w so that we have two separate integrals, one over 
x with no terms containing w and one over w with no terms containing x. 



\[ T(f(x) \otimes g(x)) = \int_{-\infty}^{\infty} f(x) \exp(2\pi i s x)\, dx \int_{-\infty}^{\infty} g(w) \exp(2\pi i s w)\, dw \]



The variables of integration can have any names we please, so we can now replace w with x, and we 
have the result we wanted to prove. 



\[
\begin{aligned}
T(f(x) \otimes g(x)) &= \int_{-\infty}^{\infty} f(x) \exp(2\pi i s x)\, dx \int_{-\infty}^{\infty} g(x) \exp(2\pi i s x)\, dx \\
&= T(f(x))\, T(g(x))
\end{aligned}
\]



If you look through the derivation above, you will see that we could have used a minus sign in the 
exponential when taking the original Fourier transform, and then the two Fourier transforms at the end 
would also contain minus signs in the exponentials. In other words, the convolution theorem applies to 
both the forward and reverse Fourier transforms. This is not surprising, since the two directions of 
Fourier transform are essentially identical. 



Proof of second statement of convolution theorem 

To prove the second statement of the convolution theorem, we start with the version we have already 
proved, i.e. that the Fourier transform of a convolution is the product of the individual Fourier 
transforms. 



\[ T(f \otimes g) = T(f)\, T(g) \]

First we'll define some shorthand, where capital letters indicate the Fourier transform mates of lower 
case letters. 

\[ F = T(f), \qquad f = T^{-1}(F) \]

\[ G = T(g), \qquad g = T^{-1}(G) \]

We use these relationships to recast the statement above in terms of the Fourier transform mates of the 
original functions. Then we take an inverse Fourier transform on each side of the equation, to get 
(essentially) the second statement of the convolution theorem. The only difference is that it is expressed 
in terms of the inverse Fourier transform. 

\[ T[T^{-1}(F) \otimes T^{-1}(G)] = F G \]

\[ T^{-1}(F) \otimes T^{-1}(G) = T^{-1}(F G) \]



But, as we noted above, we could have proved the convolution theorem for the inverse transform in the 
same way, so we can reexpress this result in terms of the forward transform. 

\[ T(fg) = T(f) \otimes T(g) \]
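
The second statement can also be checked on sampled data. One caveat applies to this sketch: with the unnormalised discrete transform assumed here, the circular convolution of the transforms carries an extra factor of n, which has no counterpart in the continuous statement above. The sample values are made up.

```python
import cmath

def dft(values):
    """Naive discrete Fourier transform with a +2*pi*i exponent, as in the text."""
    n = len(values)
    return [sum(values[x] * cmath.exp(2j * cmath.pi * s * x / n) for x in range(n))
            for s in range(n)]

def circular_convolve(a, b):
    """Periodic convolution: c[u] = sum over x of a[x] * b[(u - x) mod n]."""
    n = len(a)
    return [sum(a[x] * b[(u - x) % n] for x in range(n)) for u in range(n)]

# Hypothetical sample values on a grid of 8 points.
f = [0.0, 1.0, 2.0, 1.0, 0.0, 0.5, 0.0, 0.0]
g = [1.0, 0.0, 0.25, 0.5, 0.0, 0.0, 2.0, 0.0]
n = len(f)

lhs = dft([a * b for a, b in zip(f, g)])                     # transform of the product
rhs = [value / n for value in circular_convolve(dft(f), dft(g))]   # convolution of transforms, divided by n
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_error)
```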

The correlation theorem 

The correlation theorem is closely related to the convolution theorem, and it also turns out to be useful 
in many computations. The correlation theorem is a result that applies to the correlation function, which 
is an integral that has a definition reminiscent of the convolution integral. 

What is a correlation function? 

In a correlation integral, instead of taking the value of one function at u-x, you take the value of that 
function at x-u. Equivalently, you take the value of the other function at x+u. This is shown in the 
following equation, along with the variable substitution that allows the two expressions to be 
interconverted. 

\[
\begin{aligned}
f \circ g &= \int_{-\infty}^{\infty} f(x)\, g(x + u)\, dx; \qquad \text{substitute } x' = x + u \\
&= \int_{-\infty}^{\infty} f(x' - u)\, g(x')\, dx' \\
&= \int_{-\infty}^{\infty} f(x - u)\, g(x)\, dx
\end{aligned}
\]



The figure below illustrates why the correlation function has the name that it does. If the two functions f 
and g contain similar features, but at a different position, the correlation function will have a large value 
at a vector corresponding to the shift in the position of the feature. 

The correlation theorem can be stated in words as follows: the Fourier transform of a correlation integral 
is equal to the product of the complex conjugate of the Fourier transform of the first function and the 
Fourier transform of the second function. The only difference from the convolution theorem is the 
presence of a complex conjugate, which reverses the phase and corresponds to the inversion of the 
argument u-x. 

\[ T(f \circ g) = F^{*} G \]
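
A discrete check of the correlation theorem is sketched below. It assumes real-valued samples (for which the complex conjugate form holds) and periodic indexing; the data are made up so that `g` is a copy of `f` displaced by three grid points.

```python
import cmath

def dft(values):
    """Naive discrete Fourier transform with a +2*pi*i exponent, as in the text."""
    n = len(values)
    return [sum(values[x] * cmath.exp(2j * cmath.pi * s * x / n) for x in range(n))
            for s in range(n)]

def circular_correlate(f, g):
    """Periodic correlation: c[u] = sum over x of f[x] * g[(x + u) mod n]."""
    n = len(f)
    return [sum(f[x] * g[(x + u) % n] for x in range(n)) for u in range(n)]

# Hypothetical real data: the feature in f reappears in g three points further along.
f = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
g = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

correlation = circular_correlate(f, g)
lhs = dft(correlation)                                        # transform of the correlation
rhs = [a.conjugate() * b for a, b in zip(dft(f), dft(g))]     # F* G
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
best_shift = correlation.index(max(correlation))
print(max_error, best_shift)
```

The correlation peaks at the shift that brings the feature in `f` onto the matching feature in `g`, which is the behaviour the name refers to.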

Parseval's theorem 

Important convolutions 

Convolution with a Gaussian 

First we need to define a Gaussian function. We will stick, for the moment, to 1D Gaussians. The 
Gaussian function is the familiar bell-shaped curve, with a peak position (r_0) and standard deviation (σ). 

\[ p(r) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(r - r_0)^2}{2\sigma^2} \right) \]

We won't derive the Fourier transform of a Gaussian, but it is given by the following equation. 

\[ T(p(r)) = \exp(2\pi i r_0 s) \exp(-2\pi^2 \sigma^2 s^2) \]



Note that the Fourier transform of a Gaussian is another Gaussian (although lacking the normalisation 
constant). There is a phase term, corresponding to the position of the center of the Gaussian, and then 
the negative squared term in an exponential. Also notice that the standard deviation has moved from the 
denominator to the numerator. This means that, as a Gaussian in real space gets broader, the 
corresponding Gaussian in reciprocal space gets narrower, and vice versa. This makes sense, if you think 
about it: as the Gaussian in real space gets broader, contributions from points within that Gaussian start 
to interfere with each other at lower and lower resolutions. 



Convolution with a Gaussian will shift the origin of the function to the position of the peak of the 
Gaussian, and the function will be smeared out, as illustrated above. 
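
The transform of a Gaussian quoted above is not derived in the text, but it can at least be checked numerically. The sketch below assumes a normalised 1D Gaussian, a simple midpoint-rule integration over a finite range, and arbitrary illustrative values for r_0 and s.

```python
import cmath
import math

def gaussian(r, r0, sigma):
    """Normalised 1D Gaussian with peak position r0 and standard deviation sigma."""
    return math.exp(-(r - r0) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fourier_transform(func, s, lo=-20.0, hi=20.0, steps=4000):
    """Midpoint-rule estimate of the integral of func(r) * exp(2*pi*i*s*r) dr."""
    dr = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        r = lo + (k + 0.5) * dr
        total += func(r) * cmath.exp(2j * math.pi * s * r)
    return total * dr

r0, s = 1.5, 0.3            # hypothetical peak position and reciprocal-space coordinate
errors, magnitudes = {}, {}
for sigma in (0.5, 1.0):
    numeric = fourier_transform(lambda r: gaussian(r, r0, sigma), s)
    analytic = cmath.exp(2j * math.pi * r0 * s) * math.exp(-2 * math.pi ** 2 * sigma ** 2 * s ** 2)
    errors[sigma] = abs(numeric - analytic)
    magnitudes[sigma] = abs(analytic)
    print(sigma, errors[sigma], magnitudes[sigma])
```

At the same value of s the broader Gaussian has the smaller transform magnitude, which is the reciprocal relationship between widths described above.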



Convolution with a delta function 



Delta functions have a special role in Fourier theory, so it's worth spending some time getting 
acquainted with them. A delta function is defined as being zero everywhere but for a single point, where 
it has a weight of unity. 

\[
\delta(\mathbf{r} - \mathbf{r}_0) =
\begin{cases}
\text{weight of 1} & \text{at } \mathbf{r} = \mathbf{r}_0 \\
0 & \text{elsewhere}
\end{cases}
\]



What it means to say that it has a weight of unity is that the integral of the delta function over all space 
is 1. The delta function is given an argument of r - r_0 so that it can be defined as having its non-zero point 
at the origin. When r is equal to r_0, the argument of the delta function is zero. 

\[ \int_{\text{space}} \delta(\mathbf{r} - \mathbf{r}_0)\, d\mathbf{r} = 1 \]

A more general property of the delta function is that the integral of a delta function times some other 
function is equal to the value of that other function at the position of the delta function. 



\[ \int_{\text{space}} f(\mathbf{r})\, \delta(\mathbf{r} - \mathbf{r}_0)\, d\mathbf{r} = f(\mathbf{r}_0) \]

How can a single point, with no width, breadth or depth, have a weight of one? The value of the delta 
function at that point must be a special kind of infinity, and this means that it has to be defined as a 
limit. There are a number of ways to define a delta function. One of them is to define it as an infinitely 
sharp Gaussian. The integral over all space of a Gaussian is 1, which satisfies one of the properties 
required for the delta function, and if we take the limit of a Gaussian as the standard deviation tends to 
zero, it satisfies the other properties. The following equation defines a 3D delta function as the limit of 
an isotropic 3D Gaussian. 



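\[ \delta(\mathbf{r} - \mathbf{r}_0) = \lim_{\sigma \to 0} \left( \frac{1}{2\pi\sigma^2} \right)^{3/2} \exp\left( -\frac{|\mathbf{r} - \mathbf{r}_0|^2}{2\sigma^2} \right) \]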
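
The limiting behaviour can be illustrated numerically. The sketch below works in 1D for simplicity (the definition above is 3D) and uses cos(r) as an arbitrary test function: the integral of the test function against an ever sharper normalised Gaussian approaches the value of the test function at r_0.

```python
import math

def gaussian(r, r0, sigma):
    """Normalised 1D Gaussian with peak position r0 and standard deviation sigma."""
    return math.exp(-(r - r0) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def sift(func, r0, sigma, steps=20000):
    """Midpoint-rule estimate of the integral of func(r) * gaussian(r) dr over r0 +/- 10 sigma."""
    lo = r0 - 10 * sigma
    dr = 20 * sigma / steps
    total = 0.0
    for k in range(steps):
        r = lo + (k + 0.5) * dr
        total += func(r) * gaussian(r, r0, sigma)
    return total * dr

r0 = 0.7                                   # hypothetical position of the peak
sigmas = (1.0, 0.1, 0.01)
estimates = [sift(math.cos, r0, sigma) for sigma in sigmas]
deviations = [abs(value - math.cos(r0)) for value in estimates]
for sigma, value, deviation in zip(sigmas, estimates, deviations):
    print(sigma, value, deviation)
```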



With this definition of the delta function, we can use the Fourier transform of a Gaussian to determine 
the Fourier transform of a delta function. As the standard deviation of a Gaussian tends to zero, its 
Fourier transform tends to have a constant magnitude of 1. All that is left is the phase shift term. 

\[ T[\delta(\mathbf{r} - \mathbf{r}_0)] = \exp(2\pi i\, \mathbf{s} \cdot \mathbf{r}_0) \]

So we see that the Fourier transform of a delta function is just a phase term. Think about the picture we 
had of an electron at a point; it contributed just a phase term, with unit weight, to the diffraction pattern. 
So now we see that we can consider an electron at a point to be a delta function of electron density. 

Finally we can consider the meaning of the convolution of a function with a delta function. If we write 
down the equation for this convolution, and bear in mind the property of integrals involving the delta 
function, we see that convolution with a delta function simply shifts the origin of a function. 

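\[ \int_{\text{space}} \delta(\mathbf{r} - \mathbf{r}_0)\, f(\mathbf{u} - \mathbf{r})\, d\mathbf{r} = f(\mathbf{u} - \mathbf{r}_0) \]

\[ \delta(\mathbf{r} - \mathbf{r}_0) \otimes f(\mathbf{r}) = f(\mathbf{r} - \mathbf{r}_0) \]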

Applications of the convolution theorem 

Atomic scattering factors 

We have essentially seen this before. We can tabulate atomic scattering factors by working out the 
diffraction pattern of different atoms placed at the origin. Then we can apply a phase shift to place the 
density at the position of the atom. Our new interpretation of this is that we are convoluting the atomic 
density distribution with a delta function at the position of the atom. 
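
This interpretation can be sketched on a periodic 1D grid. The "atom" below is a made-up density profile centred on the origin, and the delta function is represented by a single unit sample; both are illustrative assumptions rather than real scattering data.

```python
import cmath

def dft(values):
    """Naive discrete Fourier transform with a +2*pi*i exponent, as in the text."""
    n = len(values)
    return [sum(values[x] * cmath.exp(2j * cmath.pi * s * x / n) for x in range(n))
            for s in range(n)]

def circular_convolve(a, b):
    """Periodic convolution: c[u] = sum over x of a[x] * b[(u - x) mod n]."""
    n = len(a)
    return [sum(a[x] * b[(u - x) % n] for x in range(n)) for u in range(n)]

n = 8
atom = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]   # hypothetical density centred on the origin
position = 3                                        # hypothetical atomic position (grid units)
delta = [1.0 if x == position else 0.0 for x in range(n)]

placed = circular_convolve(delta, atom)             # the atom moved to its position

# Transform of the placed atom versus origin-centred transform times a phase shift.
shifted = dft(placed)
phased = [cmath.exp(2j * cmath.pi * s * position / n) * value
          for s, value in enumerate(dft(atom))]
max_error = max(abs(a - b) for a, b in zip(shifted, phased))
print(placed, max_error)
```

The transform of the origin-centred density plays the role of the tabulated scattering factor here, and the exponential is the phase shift that places it at the atomic position.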

B-factors 

We can think of thermal motion as smearing out the position of an atom, i.e. convoluting its density by 
some smearing function. The B-factors (or atomic displacement parameters, to be precise) correspond to 
a Gaussian smearing function. At resolutions typical of protein data, we are justified only in using a 
single parameter for thermal motion, which means that we assume the motion is isotropic, or equivalent 
in all directions. (In crystals that diffract to atomic resolution, more complicated models of thermal 
motion can be constructed, but we won't deal with them here.) 

Above, we worked out the Fourier transform of a 1D Gaussian. 

\[ T(p(r)) = \exp(2\pi i r_0 s) \exp(-2\pi^2 \sigma^2 s^2) \]

In fact, all that matters is the displacement of the atom in the direction parallel to the diffraction vector, 
so this equation is suitable for a 3D Gaussian. All we have to remember is that the term corresponding to 
the standard deviation refers only to the direction parallel to the diffraction vector. Since we are dealing 
with the isotropic case, the standard deviation (or atomic displacement) is equal in all directions. 

The B-factor is used in an equation in terms of sinθ/λ, instead of the diffraction vector, because all that 
matters is the magnitude of the diffraction vector, which is 2sinθ/λ. We replace the variance (standard 
deviation squared) by the mean-square displacement of the atom in any particular direction. The B-factor 
can be defined in terms of the resulting equation. 

\[ \exp(-2\pi^2 \sigma^2 s^2) = \exp\left( -8\pi^2 \langle u^2 \rangle \frac{\sin^2\theta}{\lambda^2} \right) = \exp\left( -B \frac{\sin^2\theta}{\lambda^2} \right), \qquad B = 8\pi^2 \langle u^2 \rangle \]

Note that there is a common source of misunderstanding here. The mean-square atomic displacement 
refers to displacement in any particular direction. This will be equal along orthogonal x, y and z axes. 
But often we think of the mean-square displacement as a radial measure, i.e. total distance from the 
mean position. The mean-square radial displacement will be the sum of the mean-square displacements 
along x, y and z; if these are equal it will be three times the mean-square displacement in any single 
direction. So the B-factor has a slightly different interpretation in terms of radial displacements. 
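
A small worked example may help with the two interpretations. It assumes the standard relation B = 8π²⟨u²⟩, with ⟨u²⟩ the mean-square displacement along one direction, and uses Bragg's law in the form sinθ/λ = 1/(2d) to express the attenuation at a given resolution d. The displacement and resolution values are arbitrary illustrations.

```python
import math

def b_factor(mean_square_displacement):
    """B = 8 * pi^2 * <u^2>, with <u^2> the mean-square displacement along one direction."""
    return 8 * math.pi ** 2 * mean_square_displacement

def attenuation(b, resolution):
    """Factor exp(-B sin^2(theta) / lambda^2), using sin(theta)/lambda = 1 / (2 d)."""
    sin_theta_over_lambda = 1.0 / (2.0 * resolution)
    return math.exp(-b * sin_theta_over_lambda ** 2)

u_sq = 0.25                     # hypothetical: 0.5 Angstrom r.m.s. displacement in one direction
b = b_factor(u_sq)              # in Angstrom^2
radial_u_sq = 3 * u_sq          # mean-square radial displacement in the isotropic case
resolution = 2.0                # hypothetical resolution in Angstrom

factor = attenuation(b, resolution)
# Same factor written with the diffraction vector magnitude |s| = 1/d.
gaussian_form = math.exp(-2 * math.pi ** 2 * u_sq / resolution ** 2)
print(b, math.sqrt(radial_u_sq), factor, gaussian_form)
```

For this example B is about 19.7 Å², corresponding to an r.m.s. displacement of 0.5 Å along any one axis but about 0.87 Å radially.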







Diffraction from a lattice 



The convolution theorem can be used to explain why diffraction from a lattice gives 

If you've gotten this far, I'm sorry that I ran out of time to complete this document before giving the 
lecture! The rest will be filled in at some point after all the lectures (and associated web pages) are 
finished. 

Diffraction from a crystal 
Resolution truncation 
Density modification 

Solvent flattening 
Sayre's equation 

Applications of the correlation theorem 

The Patterson function 

The phased translation function 

© 1999-2005 Randy J Read, University of Cambridge. All rights reserved. 

The difference between this and Dijkstra's algorithm is that in choosing which 
vertex and edge to adjoin, we minimize a_ij, not d_j (= d_i + a_ij). We can implement 
Prim's algorithm to take about n^2 steps in a similar way, by using two arrays, one to 
tell us the current minimum weight on an edge from i to a point of S (only relevant 
for i ∉ S), and another to tell us which previous vertex i is joined to; the value of 
the latter is fixed once i ∈ S. 
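
The two-array implementation described above might look like the following sketch. The names `min_weight` and `previous` are chosen here for the two arrays, the graph is assumed to be connected and given as a symmetric weight matrix, and the example graph is made up (it is not the graph of Figure 8.34).

```python
import math

def prim(weights):
    """Minimum spanning tree of a connected graph given as a symmetric weight matrix.

    weights[i][j] is the weight of edge ij, or math.inf where there is no edge.
    Returns the tree as a list of (previous_vertex, vertex, weight) triples.
    """
    n = len(weights)
    in_tree = [False] * n
    min_weight = [math.inf] * n   # lightest edge from i to a vertex of S (used while i is outside S)
    previous = [None] * n         # the vertex of S at the other end of that edge
    in_tree[0] = True             # start S with vertex 0
    for j in range(1, n):
        min_weight[j] = weights[0][j]
        previous[j] = 0
    edges = []
    for _ in range(n - 1):
        # Adjoin the vertex outside S whose connecting edge is lightest.
        i = min((v for v in range(n) if not in_tree[v]), key=lambda v: min_weight[v])
        in_tree[i] = True
        edges.append((previous[i], i, min_weight[i]))
        # Update the arrays for the vertices still outside S.
        for j in range(n):
            if not in_tree[j] and weights[i][j] < min_weight[j]:
                min_weight[j] = weights[i][j]
                previous[j] = i
    return edges

INF = math.inf
# Hypothetical edge-weighted graph on 5 vertices.
graph = [
    [INF, 2, 3, INF, INF],
    [2, INF, 1, 4, INF],
    [3, 1, INF, 5, 6],
    [INF, 4, 5, INF, 7],
    [INF, INF, 6, 7, INF],
]
tree = prim(graph)
total_weight = sum(weight for _, _, weight in tree)
print(tree, total_weight)
```

Each of the n - 1 rounds scans the arrays once to pick a vertex and once to update them, which gives the step count of about n^2.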



Example 8.10 

Figure 8.34 

We apply Prim's algorithm to the edge-weighted graph shown in Figure 8.34. The 
edges given by the algorithm are shown in bold. 

Exercises 

1. Show that a relation having adjacency matrix A is 

(a) reflexive if and only if I + A = A 

(b) symmetric if and only if A^T = A (where A^T, the transpose of A, is obtained by 
interchanging its rows and columns) 

(c) transitive if and only if A + A^2 = A 

where 'addition' is taken as max on {0, 1}. 

2. Find the transitive closures of the relations R_1 and R_2 on {0, 1, 2, . . . , 20} given by 

(x, y) ∈ R_1 if y = x + 3 

(x, y) ∈ R_2 if y = x + 3 mod 21 

Is either of R_1 and R_2 an equivalence relation? What about TC(R_1) or TC(R_2)? 
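
For experimenting with these exercises, the matrix criteria can be turned into a few helper functions. This is only a sketch: it takes addition as max and multiplication as "and" on {0, 1}, obtains a transitive closure by repeating A ← A + A^2 until criterion (c) holds (one possible approach, not necessarily the one intended), and is demonstrated on a small made-up relation rather than on R_1 or R_2.

```python
def bool_sum(a, b):
    """Entrywise 'addition' taken as max on {0, 1}."""
    return [[max(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def bool_product(a, b):
    """Matrix product with addition as max and multiplication as 'and'."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]

def is_reflexive(a):
    n = len(a)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return bool_sum(identity, a) == a

def is_symmetric(a):
    return [list(column) for column in zip(*a)] == a

def is_transitive(a):
    return bool_sum(a, bool_product(a, a)) == a

def transitive_closure(a):
    """Repeat A <- A + A^2 until the matrix is transitive."""
    while not is_transitive(a):
        a = bool_sum(a, bool_product(a, a))
    return a

# Hypothetical relation on {0, 1, 2, 3}: (x, y) is in the relation if y = x + 1.
n = 4
relation = [[int(y == x + 1) for y in range(n)] for x in range(n)]
closure = transitive_closure(relation)
print(closure, is_reflexive(closure), is_symmetric(closure), is_transitive(closure))
```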
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